

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Chen, Luoxin, Gu, Jinming, Huang, Liankai, Huang, Wenhao, Jiang, Zhicheng, Jie, Allan, Jin, Xiaoran, Jin, Xing, Li, Chenggang, Ma, Kaijing, Ren, Cheng, Shen, Jiawei, Shi, Wenlei, Sun, Tong, Sun, He, Wang, Jiahui, Wang, Siran, Wang, Zhihong, Wei, Chenrui, Wei, Shufa, Wu, Yonghui, Wu, Yuchen, Xia, Yihang, Xin, Huajian, Yang, Fan, Ying, Huaiyuan, Yuan, Hongyi, Yuan, Zheng, Zhan, Tianyang, Zhang, Chi, Zhang, Yue, Zhang, Ge, Zhao, Tianyun, Zhao, Jianqiu, Zhou, Yichi, Zhu, Thomas Hanwen

arXiv.org Artificial Intelligence

LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when using natural language alone. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective training through reinforcement learning. In this work, we propose Seed-Prover, a lemma-style whole-proof reasoning model. Seed-Prover can iteratively refine its proof based on Lean feedback, proved lemmas, and self-summarization. To solve IMO-level contest problems, we design three test-time inference strategies that enable both deep and broad reasoning. Seed-Prover proves 78.1% of formalized past IMO problems, saturates MiniF2F, and achieves over 50% on PutnamBench, outperforming the previous state-of-the-art by a large margin. To address the lack of geometry support in Lean, we introduce a geometry reasoning engine, Seed-Geometry, which outperforms previous formal geometry engines. We use these two systems to participate in IMO 2025 and fully prove 5 out of 6 problems. This work represents a significant advancement in automated mathematical reasoning, demonstrating the effectiveness of formal verification with long chain-of-thought reasoning.
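To make the "clear supervision signal" concrete: in Lean, a proof either compiles or it does not, and the compiler's error messages give precise feedback at the failing step. The following is a minimal illustrative sketch (not taken from the paper; the statements and names are invented for illustration) of the lemma-style pattern the abstract describes, where a verified helper lemma is accumulated and reused in a later goal:

```lean
-- Hypothetical sketch of lemma-style proving: a helper lemma is proved
-- first, verified by the Lean kernel, and then reused in the main goal.
-- In a Seed-Prover-style loop, a failed `by` block would produce compiler
-- feedback that the model uses to revise the proof.
theorem double_eq (n : Nat) : n + n = 2 * n := by
  omega

theorem triple_from_double (n : Nat) : n + n + n = 3 * n := by
  have h := double_eq n  -- reuse the previously proved lemma
  omega
```

Because verification is binary and mechanical, every compiling proof is a trustworthy positive example for reinforcement learning, unlike natural-language proofs, whose correctness is hard to grade automatically.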


Google DeepMind takes step closer to cracking top-level maths

The Guardian

Even though computers were made to do maths faster than any human could manage, the top level of formal mathematics remains an exclusively human domain. But a breakthrough by researchers at Google DeepMind has brought AI systems closer than ever to beating the best human mathematicians at their own game. A pair of new systems, called AlphaProof and AlphaGeometry 2, worked together to tackle questions from the International Mathematical Olympiad, a global maths competition for secondary-school students that has been running since 1959. The Olympiad takes the form of six mind-bogglingly hard questions each year, covering fields including algebra, geometry and number theory. The combined efforts of DeepMind's two systems weren't quite at the level of the very best human competitors.